We introduce a model of thermalized conformations in space of RNA (or single stranded DNA)
molecules, which includes the possibility of hairpin formation. This model contains the usual
secondary structure information, but extends it to the study of one element of the ternary structure,
namely the end-to-end distance. The computed force-elongation characteristic is in good agreement
with some recent measurements on single stranded DNA molecules.
Recent progress in the manipulation of single biomolecules is gradually making accessible a wealth of interesting
physical information. One of the basic investigations concerns the force-elongation characteristic: its measurement
in double stranded DNA (dsDNA) molecules has provided very interesting results in the last few years, going from a
detailed characterization of the elastic properties of the molecules to the existence of new phases of dsDNA in various
regimes of tension and overcoiling [1]-[11].



While the force-elongation characteristic of dsDNA is rather well understood, the corresponding knowledge on
single stranded DNA (ssDNA) is poorer: although in some ionic conditions it may be characterized by a simple freely
jointed chain (FJC) with elastic bonds ||], this description is not valid when one changes the ionic concentrations [12].
This discrepancy is probably due to the formation of secondary structures in the ssDNA molecule [12], which can
bend back onto itself and form local helices where complementary bases A-T and G-C are paired, gaining an energy
of several kT per pair.

The formation of secondary structures is a crucial step in the folding of proteins and single stranded nucleic 
acid polymers. Its importance stems from the rather large values of the binding energy involved in this formation, 
compared to the much smaller energy scale of the interactions between secondary structures, which govern the final
three-dimensional shape of the molecules (the ternary structure). As discussed recently [13]-[15], the formation of
secondary structures in RNA (which is very similar to the one in ssDNA) provides a wonderful laboratory for detailed
studies of some of the basic mechanisms at work in heteropolymer folding.

In this paper we modify and extend the previous studies on RNA or ssDNA secondary structures in order to 
include one simple aspect of the ternary structure, namely the thermal fluctuations of the end-to-end distance, and 
its dependence on the pulling force. Our model, which is solved exactly with generating function techniques, provides 
a detailed description of the elastic properties of these polymers. It involves three parameters: the persistence length 
of the molecule, the elastic constant characteristic of bond stretching, and the pair binding energy. 

In the simplest approximation, the backbone of the polymer is described by a FJC with N elastic bonds. At thermal 
equilibrium, the probability distribution of a bond to be equal to the vector r is given by 
$$ \mu(\vec r) = C \exp\left( -\frac{(|\vec r| - b)^2}{2 \ell^2} \right) , \qquad (1) $$

where $b$ is the persistence length and $\ell$ is a length which characterizes the elasticity of the bond. For RNA or ssDNA,
one expects $b$ to be of the order of a few times the distance between successive bases, and $\ell/b$ to be much smaller than
one. The spatial conformation is thus described by the positions $\vec r_i$, $i \in \{1, \dots, N+1\}$, of the $N+1$ nodes which
are the articulation points of the chain. The attraction between complementary bases creates an effective potential
$\epsilon_{ij}(\vec r)$ between nodes $i$ and $j$ (arbitrarily far away from each other along the backbone) which involves a short
ranged attraction and a core repulsion. We perform a standard virial expansion of the partition function in terms of
the quantities $f_{ij}(\vec r) = \exp(-\epsilon_{ij}(\vec r)/kT) - 1$, which vanish for $|\vec r| > a$, where $a$ is the range of the interaction. The
secondary structure is characterized by the set of node pairs $i,j$ such that $f_{ij} \neq 0$.

Our main approximation for describing the secondary structure is the standard one in which one keeps only the 
nested diagrams @f9]Jl|-|l|, which are defined as follows: 



• Each node can be paired to at most one other node.

• Two node pairs $i < j$ and $k < l$ (with, say, $i < k$) can coexist only if they are either independent ($i < j < k < l$)
or nested ($i < k < l < j$).

This condition neglects the formation of pseudo-knots, which are known to be rare in RNA folding. This is thus the 
simplest approximation, one in which one adds to the basic elastic model (here for instance the FJC) the possibility 
of formation of hairpins, consisting of helices, and helices within helices organised in a hierarchical way. In the future 
it should be possible to generalize our model in order to include some sets of pseudo-knots, as was done for instance
in simulations in [20]. Including more refined potentials to get a better description of the secondary structure is also
possible [21].
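The two nesting rules can be illustrated by a purely combinatorial sketch that counts the allowed pairing patterns of a short chain. This is not the model itself: it ignores the spatial positions, the Boltzmann weights and any minimal hairpin loop size, and the function name `count_nested_structures` is hypothetical. It only shows how the hierarchical structure leads to a simple recursion on the last node (unpaired, or paired with some earlier node k).

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def count_nested_structures(n):
    """Count pairing patterns of nodes 1..n in which each node belongs to at
    most one pair and any two pairs are either independent or nested."""
    if n <= 1:
        return 1  # zero or one node: only the unpaired configuration
    # Case 1: the last node n is unpaired.
    total = count_nested_structures(n - 1)
    # Case 2: the last node n is paired with node k.
    for k in range(1, n):
        inside = count_nested_structures(n - k - 1)   # nodes k+1 .. n-1 (enclosed by the pair)
        outside = count_nested_structures(k - 1)      # nodes 1 .. k-1 (independent of the pair)
        total += inside * outside
    return total


if __name__ == "__main__":
    print([count_nested_structures(n) for n in range(7)])
```

With these simplifications the counts are the Motzkin numbers; the recursion for the partition function has the same branching structure, with each term dressed by bond and interaction factors.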
FIG. 1. Recursion relation for the partition function. A thin line denotes one bond, which in the elastic FJC is a vector
chosen with probability (1). A dashed line between $i$ and $j$ corresponds to an interaction term $\exp(-\epsilon_{ij}(\vec r)/kT) - 1$. The full
line is the partition function which adds up the effect of all nested or independent interaction lines.

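The hierarchical structure of the retained diagrams makes it possible to write a recursion relation for the partition
function $Z_{j,i}(\vec r)$ which describes the set of nodes $k \in \{i, \dots, j\}$, with an end-to-end distance $\vec r_j - \vec r_i = \vec r$. The
recursion is explained in fig. 1, which shows that, when $j - i > 3$:

$$
\begin{aligned}
Z_{j,i}(\vec r) = {} & \int d\vec u \, \mu(\vec u) \, Z_{j-1,i}(\vec r - \vec u)
 + f_{ji}(\vec r) \int d\vec u_1 \, \mu(\vec u_1) \, d\vec u_2 \, \mu(\vec u_2) \, Z_{j-1,i+1}(\vec r - \vec u_1 - \vec u_2) \\
 & + \sum_{k} \int \prod_{m=1}^{3} d\vec u_m \, \mu(\vec u_m) \, d\vec v \, f_{jk}(\vec v) \, Z_{j-1,k+1}(\vec v - \vec u_2 - \vec u_3) \, Z_{k-1,i}(\vec r - \vec u_1 - \vec v) \qquad (2)
\end{aligned}
$$

The three terms correspond to node $j$ being unpaired, paired with node $i$, or paired with an intermediate node $k$; the
sum runs over the allowed partners $k$ of node $j$ lying strictly between $i$ and $j$.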
This recursion relation provides the definition of our model for RNA or ssDNA folding. It modifies the recursions
which have been written previously in the studies of RNA secondary structures [13] in two respects. First,
it includes the spatial structure, i.e. the positions of the nodes. Second, it uses the virial expansion in which the
interaction term between $i$ and $j$ is given by $f_{ij}(\vec r)$. This is needed in order to get back the usual FJC in the limit
where the interaction potential $\epsilon$ vanishes.

As a first step in the study of this model, we investigate in the following the case where the interaction energy $\epsilon_{ij}(\vec r)$
is independent of the pair $(i,j)$. This amounts to using an effective interaction, averaged over the several bases included
within the persistence length $b$, in which the only effect of the sequence which is kept is the global concentration in
the various base pairs. This rough approximation turns out to be good enough for computing global properties of the
polymer like the force-elongation characteristic. The effect of sequence heterogeneities, which is crucial for dynamical
properties, is left for future studies.

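In the homogeneous case the partition function $Z_{j,i}(\vec r)$ depends only on $n = j - i$, and is denoted by $Z_n(\vec r)$. We
introduce the Fourier transform of the generating function of the $Z_n$'s:

$$ \Xi(\zeta, \vec p) = \int d\vec r \, e^{i \vec p \cdot \vec r} \left( \sum_{n \ge 0} Z_n(\vec r) \, \zeta^n \right) \qquad (3) $$

which is expressed in terms of the Fourier transforms:

$$ a(\vec p) = \int d\vec r \, \mu(\vec r) \, e^{i \vec p \cdot \vec r} , \qquad
   f(\vec p) = \int d\vec r \, [\exp(-\beta \epsilon(\vec r)) - 1] \, e^{i \vec p \cdot \vec r} . \qquad (4) $$

Using $Z_0(\vec p) = 1$ and $Z_1(\vec p) = a(\vec p)$ one derives from the recursion relation (2) the functional equation:

$$ \Xi(\zeta, \vec p) = \frac{\omega(\zeta, \vec p)}{\zeta \, [1 - a(\vec p) \, \omega(\zeta, \vec p)]} , \qquad (5) $$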
where the kernel $\omega$ satisfies the integral equation:

$$ \omega(\zeta, \vec p) = \zeta + \zeta^3 \int \frac{d^3 q}{(2\pi)^3} \, f(\vec p - \vec q) \, a(\vec q)^2 \, \Xi(\zeta, \vec q) . \qquad (6) $$
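As a quick consistency check, one can switch off the interaction in the functional equation and the integral equation for the kernel. The lines below assume these two equations read $\Xi = \omega / (\zeta [1 - a\,\omega])$ and $\omega = \zeta + \zeta^3 \int d^3q/(2\pi)^3 \, f(p-q)\, a(q)^2\, \Xi(\zeta,q)$, with $a$ and $f$ the Fourier transforms of the bond distribution and of the interaction term.

```latex
f \equiv 0 \;\Longrightarrow\; \omega(\zeta, p) = \zeta ,
\qquad
\Xi(\zeta, p) = \frac{\omega}{\zeta\,[1 - a(p)\,\omega]}
             = \frac{1}{1 - a(p)\,\zeta}
             = \sum_{n \ge 0} a(p)^n \, \zeta^n .
```

Reading off the coefficients gives $Z_n(p) = a(p)^n$, the partition function of a free chain of $n$ independent bonds, in agreement with $Z_0(p) = 1$ and $Z_1(p) = a(p)$.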
The force-elongation characteristic for a chain with $N$ bonds can be deduced from the partition function in presence
of a force:

$$ Z_N^F = \int d\vec r \, Z_N(\vec r) \, e^{\beta \vec F \cdot \vec r} . \qquad (7) $$

Its generating function is nothing but $\Xi(\zeta, \vec p_F)$, where $\vec p_F$ is an imaginary momentum given by $\vec p_F = (0, 0, -i\beta F)$
for a force $F$ pulling in the third direction. For a long chain, $N \gg 1$, one expects a partition function behaving
as $Z_N^F \sim A \exp(-\beta N \phi(F)) / N^{\alpha}$. The free energy per bond $\phi(F)$ determines the radius of convergence of the series
defining the generating function $\Xi(\zeta, \vec p_F)$. It is thus equal to $\phi(F) = (1/\beta) \ln(\zeta^*)$, where $\zeta^*$ is the singularity of $\Xi(\zeta)$
which is the nearest to the origin. From the free energy per bond one deduces the elongation $L$ along the axis of the
force, $L = -N \, d\phi/dF$, as well as the average fraction of pairings $n_p$ (defined as the number of pairings divided by
$N$), $n_p = -\partial \ln(\zeta^*) / \partial \ln(\gamma)$.

The integral equation (6) is easily solved in the case where the range of the interaction potential is small compared
to $b$ (this approximation is again valid when $b$ is much larger than the interbase distance). One can then neglect the
momentum dependence of $f$ and substitute $f(p)$ by the constant $\gamma b^3$, where $\gamma$ is a dimensionless number characteristic
of the strength of the pairing and defined by $\gamma = f(0)/b^3 = \int d\vec r / b^3 \, [\exp(-\beta\epsilon(\vec r)) - 1]$. The kernel $\omega$ is then momentum
independent. The relation (6) between $\zeta$ and $\omega$ can be written as $\omega = \zeta + \zeta^2 A(\omega)$, where the function $A(\omega)$ is
monotonically increasing and such that $A'(\omega = 1) = \infty$. One can then show that $\omega(\zeta)$ has a second order branching
point at $\zeta_{bp}$ and is analytic for $|\zeta| < \zeta_{bp}$, where $\zeta_{bp}$ is the maximum of the function $(-1 + \sqrt{1 + 4\omega A(\omega)})/(2A(\omega))$.
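The intermediate step can be spelled out as follows. This is a sketch which assumes the functional equation in the form $\Xi = \omega/(\zeta[1 - a\,\omega])$ and the kernel equation in the form $\omega = \zeta + \zeta^3 \int d^3q/(2\pi)^3 \, f\, a^2\, \Xi$, with $f$ replaced by the constant $\gamma b^3$; the explicit expression for $A(\omega)$ follows from these assumptions and is not quoted from the text.

```latex
\omega = \zeta + \zeta^3 \, \gamma b^3 \int \frac{d^3 q}{(2\pi)^3} \, a(q)^2 \,
         \frac{\omega}{\zeta \, [1 - a(q)\,\omega]}
       = \zeta + \zeta^2 A(\omega) ,
\qquad
A(\omega) = \gamma b^3 \, \omega \int \frac{d^3 q}{(2\pi)^3} \, \frac{a(q)^2}{1 - a(q)\,\omega} ,
\qquad
\zeta(\omega) = \frac{-1 + \sqrt{1 + 4\,\omega A(\omega)}}{2 A(\omega)} .
```

The last expression is the positive root of the quadratic $A\,\zeta^2 + \zeta - \omega = 0$. If $a(q) \simeq 1 - \kappa q^2$ at small $q$, the integral defining $A$ stays finite at $\omega = 1$ in three dimensions while its derivative diverges, which is the property $A'(1) = \infty$ used above.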

The singularities of $\Xi$ which control the large $n$ behavior of $Z_n$ are the branching point of $\omega(\zeta)$ at $\zeta_{bp}$ and the
pole at $\zeta_p(p)$ determined by the vanishing of the denominator of equation (5), when the momentum is equal to $p_F$:
$\omega(\zeta_p)\, a(p_F) = 1$. For purely elastic bonds with $\ell \ll b$, one finds $a(p) \simeq [\sin(pb)/(pb)] \exp(-p^2 \ell^2/2)$, and the pole is
located at:

$$ \omega(\zeta_p) = \frac{\beta F b}{\sinh(\beta F b)} \, e^{-\beta^2 F^2 \ell^2 / 2} . \qquad (8) $$

To each of the two singularities is associated one phase of the model. The "hairpinned" phase corresponds to the
branching point singularity. As long as we neglect the momentum dependence of $\omega$ (i.e. for small interaction radius)
the free energy per bond $\phi(F) = (1/\beta) \log \zeta_{bp}$ is force independent. The elongation of the polymer is thus not extensive in
the long chain ($N \to \infty$) limit. A fraction $n_p$ of nodes is paired, with $n_p$ independent of the applied force. The chain
is bent in a few, i.e. $O(N^0)$, hairpins, each one involving $O(N)$ bonds. The "elongated" phase corresponds to the pole
singularity. The free energy $\phi(F) = (1/\beta) \log \zeta_p(p_F)$ is force dependent and the elongation is extensive (proportional
to $N$). This can be written as $L(F) = n_{free}(F)\, L_{FJC}(F)$, where $L_{FJC}(F)$ is the elongation without interaction (i.e.
in the case $\gamma = 0$) and $n_{free}(F)$ the fraction of nodes which do not belong to any hairpin. The fraction of pairings
rapidly decreases with the applied force. The number of hairpins is $O(N)$.
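The interaction-free elongation that enters the elongated phase can be sketched numerically. The snippet below assumes the elastic FJC form $a(p) \simeq [\sin(pb)/(pb)] \exp(-p^2\ell^2/2)$ evaluated at the imaginary momentum $p_F$, so that the free energy per bond at $\gamma = 0$ is $\phi = -kT \ln a(p_F)$ and $L_{FJC}/N = -d\phi/dF$. The function names and the numerical values (kT of about 4.11 pN nm, $b$ = 1.92 nm, $\ell$ = 0.101 nm) are illustrative assumptions, not part of the original analysis.

```python
import math


def log_bond_ft(beta_F, b, ell):
    """ln a(p_F) for the elastic FJC: ln[sinh(x)/x] + (beta_F*ell)^2 / 2, x = beta_F*b.
    Assumes ell << b, as in the text."""
    x = beta_F * b
    return math.log(math.sinh(x) / x) + 0.5 * (beta_F * ell) ** 2


def fjc_extension_per_bond(F, b, ell, kT):
    """L_FJC / N = d ln a(p_F) / d(beta F) = b*(coth(x) - 1/x) + beta*F*ell^2."""
    beta_F = F / kT
    x = beta_F * b
    return b * (1.0 / math.tanh(x) - 1.0 / x) + beta_F * ell ** 2


if __name__ == "__main__":
    kT, b, ell = 4.11, 1.92, 0.101      # pN nm, nm, nm (assumed illustrative values)
    for F in (0.5, 1.0, 5.0, 10.0, 20.0):
        print(F, fjc_extension_per_bond(F, b, ell, kT))
```

The first term is the usual Langevin function of the freely jointed chain; the second is the linear stretching of the bonds, which only matters at large force when $\ell \ll b$.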

In our model there exists a second order phase transition between the hairpinned phase at low force and the
elongated phase at high force. This phase transition is a robust feature of the model which does not depend on the
details of the interaction potential and of the bond stretching potential: the branching point singularity, associated
with the hairpinned phase, is present as soon as $a(p) \simeq 1 - \kappa p^2$ for small $|p|$, which is the generic situation. The pole
singularity, associated with the elongated phase, is always present. The boundary between the two phases occurs at a
critical force $F_c(\gamma)$ which increases monotonically with $\gamma$. Slightly above the threshold the elongation grows linearly
with the force: $L(F) \propto F - F_c(\gamma)$. The asymptotic behaviors of the dimensionless critical force are:

$$ \beta b F_c(\gamma) \simeq \tfrac{1}{2} \log(\gamma) \quad \text{for } 1 \ll \gamma \ll e^{b^2/\ell^2} , \qquad
   \beta b F_c(\gamma) \propto \gamma \quad \text{for } \gamma \ll 1 . \qquad (9) $$

Notice that the linear dependence of $F_c$ at small $\gamma$ is a prediction which is independent of the detailed form of the bond
probability distribution (1).
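A rough numerical sketch of how the critical force can be located is given below. It is not the authors' procedure: it assumes a momentum-independent kernel, the elastic FJC form of $a(q)$, lengths measured in units of $b$, and the explicit expression $A(\omega) = \gamma\,\omega \int d^3q/(2\pi)^3\, a(q)^2/(1 - a(q)\omega)$ obtained by inserting the functional equation into the kernel equation. The integral is done with a plain midpoint rule and the branching point is found by a grid search, so the result is only accurate to the grid spacing (and unreliable for very small $\gamma$, where the maximum sits extremely close to $\omega = 1$). All function names are hypothetical.

```python
import math


def bond_ft(q, ell):
    """a(q) for the elastic FJC with b = 1: [sin(q)/q] * exp(-q^2 ell^2 / 2)."""
    return math.sin(q) / q * math.exp(-0.5 * (q * ell) ** 2)


def make_A(gamma, ell, q_max=100.0, dq=0.02):
    """Return the function A(omega), using a midpoint rule for the radial integral
    (1 / 2 pi^2) * int_0^inf q^2 a(q)^2 / (1 - a(q) omega) dq."""
    qs = [(k + 0.5) * dq for k in range(int(q_max / dq))]   # midpoints avoid q = 0
    table = [(q * q, bond_ft(q, ell)) for q in qs]

    def A(omega):
        s = sum(q2 * a * a / (1.0 - a * omega) for q2, a in table)
        return gamma * omega * s * dq / (2.0 * math.pi ** 2)

    return A


def branch_point(gamma, ell, n_grid=200):
    """Grid search for the maximum of zeta(omega) = (-1 + sqrt(1 + 4 omega A)) / (2 A)
    on 0 < omega < 1. Returns (zeta_bp, omega_bp)."""
    A = make_A(gamma, ell)
    best_zeta, best_omega = 0.0, 0.0
    for k in range(1, n_grid):
        omega = k / n_grid
        a_val = A(omega)
        zeta = (-1.0 + math.sqrt(1.0 + 4.0 * omega * a_val)) / (2.0 * a_val)
        if zeta > best_zeta:
            best_zeta, best_omega = zeta, omega
    return best_zeta, best_omega


def critical_force(gamma, ell):
    """Dimensionless critical force x = beta*b*F_c, where the pole meets the branching
    point: x / sinh(x) * exp(-x^2 ell^2 / 2) = omega_bp, solved by bisection."""
    _, omega_bp = branch_point(gamma, ell)

    def g(x):
        return x / math.sinh(x) * math.exp(-0.5 * (x * ell) ** 2) - omega_bp

    lo, hi = 1e-9, 50.0          # g(lo) > 0 and g(hi) < 0; g decreases with x
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    ell = 1.01 / 19.2            # ell / b, taken from the fitted values quoted in the text
    for gamma in (0.59, 1.89, 3.9):
        print(gamma, critical_force(gamma, ell))
```

Multiplying the returned value by $kT/b$ converts it to a force. A finer grid or a proper one-dimensional maximizer would be needed for quantitative work.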

Equations (5) and (6) can be easily solved numerically. We compared our theoretical predictions with the experimental
data presented in Ref. [12] on the force versus elongation characteristic for a charomid-ssDNA at room
temperature under different salinity conditions. Using the elastic model for bond stretching (1), our three fitting
parameters are: the persistence length $b$, the elasticity $\ell$, the interaction parameter $\gamma$.
FIG. 2. Fit of the force-elongation characteristic of charomid ssDNA. The circles are the experimental data of Ref. [12]. The
continuous line is the best fitting curve obtained with our model. The dashed curve is the FJC characteristic obtained
by switching off the interaction. The difference between the two is due to the formation of hairpins.



As shown in fig. 2, we obtain a good agreement with the experimental curve at the highest salt concentration (10
mM PB, 5 mM Mg). The small elongation region ($L/L_0 < 0.1$) of the experiment was not considered since the
interactions of the molecule with the glass plate cannot be ignored (this forbids a study of the critical force region
with the present data). The fitting parameters are: the persistence length $b$, the elasticity $\ell$, the interaction parameter
$\gamma$. The number of bonds was fixed as in Q such that $Nb = 1.6875\,L_0$, where $L_0$ is the crystallographic length of
the double stranded DNA. A least square fit yields the following results: $b = 19.2$ Å, $\gamma = 1.89$, $\ell = 1.01$ Å. The
orders of magnitude of the various parameters are correct. The persistence length is of the order of three times the
interbase distance $b_0$ (our approximation of a large value of $b \gg b_0$ is marginally self-consistent and should be improved
upon in the future). The value of $\ell$, when expressed in terms of the enthalpic elasticity $S$ as in [12], corresponds to
$S = b/(\beta \ell^2) \sim 1000$ pN, typical of the values measured at higher forces |i 10 O]; the approximation $\ell \ll b$ is valid.
The value of $\gamma$ is characteristic of the strength of the interaction. For a potential well of width $a$ and depth $\epsilon_0$, one
has $\gamma \sim (a/b)^3 \exp(\beta\epsilon_0)$, which is compatible with some typical values such as $a \sim 4$ Å, $\epsilon_0 \sim 2.9\,kT$.
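The order of magnitude quoted for the enthalpic elasticity can be checked directly from the fitted values. The arithmetic below assumes $kT \approx 4.1$ pN nm at room temperature, a value not stated in the text.

```latex
S = \frac{b}{\beta \ell^2} = kT \, \frac{b}{\ell^2}
  \approx 4.1\ \mathrm{pN\,nm} \times \frac{1.92\ \mathrm{nm}}{(0.101\ \mathrm{nm})^2}
  \approx 7.7 \times 10^{2}\ \mathrm{pN} ,
\qquad
\frac{\ell}{b} = \frac{1.01}{19.2} \approx 0.053 .
```

Both numbers agree with the statements that $S$ is of order 1000 pN and that $\ell \ll b$.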
FIG. 3. The fraction of paired nodes in the secondary structure as a function of the external pulling force. The three curves
refer to three different values of the interaction parameter $\gamma$. From top to bottom $\gamma = 3.9$, $1.9$, $0.59$. The other parameters of
the model correspond to the experimental situation: they are fixed as in the fit of Fig. 2.
From our computation one can deduce the pairing fraction $n_p$ in the conditions of the experiment. This is plotted in
Fig. 3. It is clear that in the region of forces above 10 pN there is no pairing. This is consistent with the measurements
of ref. [12] which showed that the characteristics of two different ssDNA's with different G-C concentration merge in
that region. One should keep in mind that the two fitting parameters $b$ and $\ell$ are basically fixed by this high force
region where there is no pairing. The low force part with pairing is the one which fixes the binding parameter $\gamma$.

When the salinity is lowered some new physical effects become relevant. The electrostatic interactions between the 
bases are less effectively screened, and probably the FJC is not a good model. Our model should still give a reasonable 
account of the effect of secondary structures on the elongation. One possibility to test it is to use, instead of the elastic 
FJC, the experimental force elongation characteristics measured on a molecule exposed first to a chemical treatment 
(for instance glyoxal) which destroys the ability of the bases to pair. Our model should then allow one to deduce from
two experimental curves (one with glyoxal, the other without) the effect of the secondary structures.

In this paper we have introduced a solvable model of the structure of ssDNA or RNA molecules which includes,
together with the secondary structure, one important element of their ternary structure. The model gives a general
framework for including the effect of hairpin formation in the elongation properties of the molecules. When used with
a simple FJC model for the polymer without hairpins, it is in good agreement with the experimental data at high
ionic concentration. Several extensions of this study are natural. The description of data obtained at smaller ionic
concentration requires going beyond the FJC approximation. Another natural extension of our study, also possible
within this model, is to study the effects of the disorder in the sequence of bases.